07. Filter, Drop Nulls, Dedupe
1. Filter
For consistency, only compare cars certified by California standards. Filter both datasets using query to select only rows where cert_region is CA. Then drop the cert_region column from each dataset, since it will no longer provide any useful information (we'll know every value is 'CA').
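The filter step can be sketched as follows. This is a minimal illustration on a made-up toy frame, assuming the datasets are pandas DataFrames. Only the column name cert_region and the value 'CA' come from the exercise; the names `df` and `df_ca`, the `model` column, and the other region codes are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for one of the two datasets. Only `cert_region`
# and the value 'CA' are taken from the exercise text.
df = pd.DataFrame({
    "model": ["A", "B", "C", "D"],
    "cert_region": ["CA", "FA", "CA", "FC"],
})

# Keep only the rows certified under California standards.
df_ca = df.query('cert_region == "CA"')

# Every remaining value is 'CA', so the column no longer adds information.
df_ca = df_ca.drop(columns=["cert_region"])

print(df_ca.shape)
```

The same two calls would be repeated for the second dataset.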
2. Drop Nulls
Drop any rows in both datasets that contain missing values.
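One way to do this, again assuming pandas DataFrames, is `dropna`, which by default removes every row containing at least one missing value. The frame below is made-up toy data, and the names `df` and `df_clean` are hypothetical.

```python
import pandas as pd

# Hypothetical toy data with a missing value in each column.
df = pd.DataFrame({
    "model": ["A", "B", None, "D"],
    "cyl": [4.0, None, 6.0, 8.0],
})

# Count missing values per column before dropping anything.
print(df.isnull().sum())

# Drop every row that has at least one missing value.
df_clean = df.dropna()

# Confirm that no missing values remain anywhere in the frame.
print(df_clean.isnull().any().any())
```

Checking `isnull().sum()` first shows how many rows are affected before they are removed.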
3. Dedupe
Drop any duplicate rows in both datasets.
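A short sketch of the dedupe step, assuming pandas DataFrames: `duplicated` flags rows that exactly repeat an earlier row, and `drop_duplicates` removes them. The data and the names `df`, `n_dupes`, and `df_unique` are hypothetical.

```python
import pandas as pd

# Hypothetical toy data: the second row exactly repeats the first.
df = pd.DataFrame({
    "model": ["A", "A", "B", "B"],
    "cyl": [4, 4, 6, 8],
})

# Count rows that are exact copies of an earlier row.
n_dupes = df.duplicated().sum()
print(n_dupes)

# Keep the first occurrence of each row and drop the repeats.
df_unique = df.drop_duplicates()
print(df_unique.shape)
```

Rows that share a model but differ in any other column are not duplicates and are kept.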
QUIZ QUESTION:
Match the values for the following features about the new dataset after filtering by certification region.
ANSWER CHOICES:
| Feature | Value |
|---|---|
| | 14 |
| | 798 |
| | 13 |
| | 823 |
| | 10 |
| | 2404 |
| | 1611 |
| | 1084 |
SOLUTION:
| Feature | Value |
|---|---|
| | 798 |
| | 13 |
| | 1084 |